<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<title>Performance</title>
<link rel="stylesheet" href="../../../../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../index.html" title="Chapter&#160;1.&#160;Fiber">
<link rel="up" href="../index.html" title="Chapter&#160;1.&#160;Fiber">
<link rel="prev" href="integration/deeper_dive_into___boost_asio__.html" title="Deeper Dive into Boost.Asio">
<link rel="next" href="performance/tweaking.html" title="Tweaking">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../boost.png"></td>
<td align="center"><a href="../../../../../index.html">Home</a></td>
<td align="center"><a href="../../../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="integration/deeper_dive_into___boost_asio__.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../index.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="performance/tweaking.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="fiber.performance"></a><a class="link" href="performance.html" title="Performance">Performance</a>
</h2></div></div></div>
<div class="toc"><dl class="toc"><dt><span class="section"><a href="performance/tweaking.html">Tweaking</a></span></dt></dl></div>
<p>
      Performance measurements were taken using <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">chrono</span><span class="special">::</span><span class="identifier">highresolution_clock</span></code>,
      with overhead corrections. The code was compiled with gcc-6.20, using build
      options: variant = release, optimization = speed. Tests were executed on Intel
      Core i7-4770S 3.10GHz, 4 cores with 8 hyperthreads (4C/8T), running Linux (4.7.0/x86_64).
    </p>
<p>
      Measurements headed 1C/1T were run in a single-threaded process.
    </p>
<p>
      The <a href="https://github.com/atemerev/skynet" target="_top">microbenchmark <span class="emphasis"><em>syknet</em></span></a>
      from Alexander Temerev was ported and used for performance measurements. At
      the root the test spawns 10 threads-of-execution (ToE), e.g. actor/goroutine/fiber
      etc.. Each spawned ToE spawns additional 10 ToEs ... until 100000 ToEs are
      created. ToEs return back ther ordinal numbers (0 ... 99999), which are summed
      on the previous level and sent back upstream, until reaching the root. The
      test was run 10-20 times, producing a range of values for each measurement.
    </p>
<div class="table">
<a name="fiber.performance.time_per_actor_erlang_process_goroutine__other_languages___average_over_100_000_"></a><p class="title"><b>Table&#160;1.1.&#160;time per actor/erlang process/goroutine (other languages) (average over
      100,000)</b></p>
<div class="table-contents"><table class="table" summary="time per actor/erlang process/goroutine (other languages) (average over
      100,000)">
<colgroup>
<col>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
              <p>
                Haskell | stack-1.0.4
              </p>
            </th>
<th>
              <p>
                Erlang | erts-7.0
              </p>
            </th>
<th>
              <p>
                Go | go1.6.1 (GOMAXPROCS == default)
              </p>
            </th>
<th>
              <p>
                Go | go1.6.1 (GOMAXPROCS == 8)
              </p>
            </th>
</tr></thead>
<tbody><tr>
<td>
              <p>
                0.32 &#181;s
              </p>
            </td>
<td>
              <p>
                0.64 &#181;s - 1.21 &#181;s
              </p>
            </td>
<td>
              <p>
                1.52 &#181;s - 1.64 &#181;s
              </p>
            </td>
<td>
              <p>
                0.70 &#181;s - 0.98 &#181;s
              </p>
            </td>
</tr></tbody>
</table></div>
</div>
<br class="table-break"><p>
      <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">thread</span></code> can not be tested at this time (C++14)
      because the API does not allow to set thread stack size (idefault on Linux
      2MB, on Windows 1MB). An out-of-memory error is likely. With pthreads the stack
      size is set 8kBC.
    </p>
<div class="table">
<a name="fiber.performance.time_per_thread__average_over__10_000____unable_to_spawn_100_000_threads_"></a><p class="title"><b>Table&#160;1.2.&#160;time per thread (average over *10,000* - unable to spawn 100,000 threads)</b></p>
<div class="table-contents"><table class="table" summary="time per thread (average over *10,000* - unable to spawn 100,000 threads)">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th>
              <p>
                pthread
              </p>
            </th>
<th>
              <p>
                <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">thread</span></code>
              </p>
            </th>
</tr></thead>
<tbody><tr>
<td>
              <p>
                14.4 &#181;s - 20.8 &#181;s
              </p>
            </td>
<td>
              <p>
                18.8 &#181;s - 28.1 &#181;s
              </p>
            </td>
</tr></tbody>
</table></div>
</div>
<br class="table-break"><p>
      The test utilizes 4 cores with Symmetric MultiThreading enabled (8 logical
      CPUs). The fiber stacks are allocated by <a class="link" href="stack.html#class_fixedsize_stack"><code class="computeroutput">fixedsize_stack</code></a>.
    </p>
<p>
      As the benchmark shows, the memory allocation algorithm is significant for
      performance in a multithreaded environment. The tests use glibc&#8217;s memory allocation
      algorithm (based on ptmalloc2) as well as Google&#8217;s <a href="http://goog-perftools.sourceforge.net/doc/tcmalloc.html" target="_top">TCmalloc</a>
      (via linkflags="-ltcmalloc").<a href="#ftn.fiber.performance.f0" class="footnote" name="fiber.performance.f0"><sup class="footnote">[9]</sup></a>
    </p>
<p>
      The <a class="link" href="scheduling.html#class_shared_work"><code class="computeroutput">shared_work</code></a> scheduling algorithm uses one global queue,
      containing fibers ready to run, shared between all threads. The work is distributed
      equally over all threads. In the <a class="link" href="scheduling.html#class_work_stealing"><code class="computeroutput">work_stealing</code></a> scheduling
      algorithm, each thread has its own local queue. Fibers that are ready to run
      are pushed to and popped from the local queue. If the queue runs out of ready
      fibers, fibers are stolen from the local queues of other participating threads.
    </p>
<div class="table">
<a name="fiber.performance.time_per_fiber__average_over_100_000_"></a><p class="title"><b>Table&#160;1.3.&#160;time per fiber (average over 100,000)</b></p>
<div class="table-contents"><table class="table" summary="time per fiber (average over 100,000)">
<colgroup>
<col>
<col>
<col>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
              <p>
                fiber (4C/8T, work stealing, tcmalloc)
              </p>
            </th>
<th>
              <p>
                fiber (4C/8T, work stealing)
              </p>
            </th>
<th>
              <p>
                fiber (4C/8T, work sharing, tcmalloc)
              </p>
            </th>
<th>
              <p>
                fiber (4C/8T, work sharing)
              </p>
            </th>
<th>
              <p>
                fiber (1C/1T, round robin, tcmalloc)
              </p>
            </th>
<th>
              <p>
                fiber (1C/1T, round robin)
              </p>
            </th>
</tr></thead>
<tbody><tr>
<td>
              <p>
                0.13 &#181;s - 0.26 &#181;s
              </p>
            </td>
<td>
              <p>
                0.35 &#181;s - 0.66 &#181;s
              </p>
            </td>
<td>
              <p>
                0.62 &#181;s - 0.80 &#181;s
              </p>
            </td>
<td>
              <p>
                0.90 &#181;s - 1.11 &#181;s
              </p>
            </td>
<td>
              <p>
                0.90 &#181;s - 1.03 &#181;s
              </p>
            </td>
<td>
              <p>
                0.91 &#181;s - 1.28 &#181;s
              </p>
            </td>
</tr></tbody>
</table></div>
</div>
<br class="table-break"><div class="footnotes">
<br><hr style="width:100; text-align:left;margin-left: 0">
<div id="ftn.fiber.performance.f0" class="footnote"><p><a href="#fiber.performance.f0" class="para"><sup class="para">[9] </sup></a>
        Tais B. Ferreira, Rivalino Matias, Autran Macedo, Lucio B. Araujo <span class="quote">&#8220;<span class="quote">An
        Experimental Study on Memory Allocators in Multicore and Multithreaded Applications</span>&#8221;</span>,
        PDCAT &#8217;11 Proceedings of the 2011 12th International Conference on Parallel
        and Distributed Computing, Applications and Technologies, pages 92-98
      </p></div>
</div>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright &#169; 2013 Oliver Kowalke<p>
        Distributed under the Boost Software License, Version 1.0. (See accompanying
        file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
      </p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="integration/deeper_dive_into___boost_asio__.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../index.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="performance/tweaking.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
</body>
</html>
